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Abstract 

We  study  an  acoustic  model  of  the  inverse  problem  of  reflection  seismology.  A 
straightforward  best-fit  formulation  of  this  problem  has  been  used  by  several  investiga¬ 
tors  as  the  basis  for  numerical  solution  of  this  problem,  but  typically  fails  when  coupled 
with  local  optimization  algorithms.  We  give  an  explanation  for  this  failure,  and  suggest 
a  relaxation  of  the  best-fit  formulation  which  may  be  more  amenable  to  quasi-Newton 
optimization.  Our  analysis  relies  on  linearization  and  on  a  high-frequency  (Fourier 
Integral)  approximation  to  the  scattered  field. 


1  Introduction 


The  constant-density  linear  acoustic  model  of  small  amplitude  wave  motion  in  a  fluid  con¬ 
nects  the  pressure  field  p(x,t),  the  sound  velocity  field  v(x),  and  a  body  force  divergence 
(“source”)  f(x,t )  through  the  wave  equation 


with  appropriate  side  conditions.  Regarding  the  source  term  /  as  known,  we  study  the 
dependence  of  p  on  v,  with  the  aim  of  understanding  the  inverse  problem,  i.e.  the  inference 
of  v  from  a  sampling  of  p  at  receiver  locations.  When  both  sources  (i.e.  the  support  of  /) 
and  receivers  are  separated  from  the  target  heterogeneities  in  v  by  a  hyperplane,  it’s  natural 
to  call  this  inference  the  reflection  inverse  problem. 

The  study  of  such  problems  originated  in  exploration  geophysics,  and  much  insight  can 
be  gleaned  form  the  literature  of  that  subject. 

It  is  natural  to  approach  this  inverse  problem  by  developing  an  objective  function,  the 
minimization  of  which  gives  an  estimate  of  the  unknown  coefficient  v.  In  this  paper  we 
study  two  choices  of  objective  function.  The  first,  a  straightforward  mean-square  data  error 
measure,  has  been  the  most  common  choice  for  numerical  work,  but  leads  to  intractable 
computations:  the  objective  function  is  essentially  non-smooth,  and  non-convex.  We  explain 


this  pathology,  and  show  how  enlargement  of  the  model  space  and  addition  of  suitable  con- 
straixrts  may  lead  to  a  smooth  optimization  problem,  with  qualitatively  better-conditioned 
quadratic  approximations  and  at  least  local  convergence  near  consistent  minima. 

The  arguments  in  this  paper  are  for  the  most  part  formal,  and  based  on  some  drastic 
approximations.  Detailed  mathematical  treatment  and  extensive  numerical  experiments  for 
the  special  case  of  plane- wave  source  and  layered  medium  (i.e.  v  depending  only  on  one 
coordinate)  appear  in  Santosa  and  Symes  [12],  Symes  [13],  [16]  and  Symes  and  Carazzone 
[17],  [18].  Some  further  discussion  on  point  sources  and  general  media,  as  discussed  here, 
appears  in  Symes  [14]  [15].  These  latter  references  include  numerical  experiments  with  the 
(second)  objective  functional  introduced  below. 

2  The  Model 

For  the  purposes  of  this  paper,  the  data  of  the  inverse  problem  is  the  trace  of  p  on  the 
hypersurface  { xn  —  0},  multiplied  by  a  smooth  function  of  compact  support.  The  source 
/  is  of  point  support  in  the  space  variables,  and  we  allow  it  to  move  in  the  plane  {a:  = 

(x  ,  Xn)  .  Xn  —  2S}. 

f(x,t)  =  F(t)6(x  -  x„)  x3in  =  zs  . 

Thus  p  depends  on  xs  as  well.  We  assume  that  the  source  is  quasi-impulsive,  i.e.  F(t )  = 
<$(/)+  a  smooth  perturbation.  Finally,  we  assume  that  v  is  constant,  v  =  0,  in  the  half-space 
{x  :  xn  <  zb},  0  <  zs  <  zb. 

These  assumptions  violate  a  number  of  limitations  of  sampling  and  bandwidth  important 
in  the  treatment  of  real-world  data,  e.g.  in  reflection  seismology,  but  in  our  judgement  leave 
the  mathematical  heart  of  the  problem  intact. 

Thus  finally 

%]  :=  P  U„=o 

is  the  data  of  the  inverse  problem.  Since  our  study  is  modeled  on  reflection  seismology,  we 
call  S  the  seismogram. 

The  properties  of  the  map  v  ^  S  are  at  present  only  poorly  understood,  so  we  study 
instead  an  approximation,  obtained  by  splitting  v  =  vb  4-  vr  into  a  smooth  background 
velocity  vb  and  a  rough  or  oscillatory  perturbation  vr.  Using  regular  first-order  perturbation 
theory  we  write 

5  “  Sb  +  Sr 

where  Sb  is  the  background  seismogram,  i.e.  the  trace  p  |Xn=o  with  v  —  vb ,  and  Sr  is  the 
perturbation  due  to  vr.  A  great  deal  of  numerical  evidence  indicates  that  this  approximation 
is  quite  accurate  so  long  as  vT  is  oscillatory.  For  a  theoretical  study  in  the  one-dimensional 
case  see  Lewis  [7].  If  vb  is  sufficiently  smooth,  which  we  assume,  then  Sb  consists  of  the 
direct  wave,  plus  possibly  refractions.  We  limit  our  attention  to  reflections  here;  so  we 
assume  that  Sb  has  been  subtracted  out  (a  nontrivial  step  in  practice!)  and  identify  S  with 

Sr- 

Sr  is  thus  the  sampling  at  the  receiver  locations  of  the  pressure  field  perturbation  6p, 
which  solves 

1 

~2  qjo  *  "p  —  »  y 

at1  vb 

plus  appropriate  side  conditions.  Note  the  appearance  of  the  reflectivity  r  =  vr/vb ;  in  the 
sequal  we  shall  use  it  rather  than  vr  to  represent  the  rough  part  of  the  model. 


d2bp  „2c _ 2vr  2 


As  is  well-known  (Cohen  and  Bleistein  [3],  Beylkin  [1],  Rakesh  [11])  S  can  be  approxi¬ 
mated  rather  effectively  as  an  oscillatory  integral  (Fourier  Integral  Operator)  of  the  form 

5[ni)]r(xs;xr,t) 

S  f*  JdtA[v.](x,,xrM)^v‘]ix‘*rM)-t(0- 

The  notation  is  chosen  to  emphasize  the  following  points: 

(1)  The  seismogram  is  a  function  of  the  source  parameter  xs,  the  receiver  coordinate  xr, 
and  time  f; 

(2)  The  amplitude  or  symbol  A  and  the  phase  cj>  also  depend  on  a  wave  vector  £  of  the  same 
dimensionality  as  the  space  coordinates. 

(3)  A  and  <j>  depend  further  on  a  point  source  coordinate  x  and  are  convolved  in  x  and  t 
against  the  source  distribution  /(x,t,xs). 

(4)  The  seismogram  S,  the  symbol  A ,  and  the  phase  function  <j>  depend  functionally  on  Vb. 

(5)  The  seismogram  S  depends  linearly  on  the  reflectivity  r. 

For  space  dimension  n ,  the  symbol  A  behaves  for  large  |£|  like 

for  a  suitable  smooth  function  Aq  which  is  non-zero  over  a  sector  in  f/|£|  determined  by  the 
bicharacteristic  geometry  (hence  by  Vb).  The  phase  function  (f>  is  positively  homogeneous  of 
degree  1  in 

3  Least  Squares  Inversion 

With  the  conventions  established  so  far,  we  can  state  a  simple  version  of  the  least-squares 
inversion  problem: 

Find  Vb,  r  to  minimize 

Jls  [wf>>  r\  ‘S'dat aJ  =  2  J  J  J  dxsdxTdt  |5[t>h]  •  r  —  •S'dataJ 


Here  we  understand  the  integral  sign  to  mean  integral  or  sum,  as  appropriate. 

In  a  typical  reflection  seismic  model  in  2D,  Vb  might  be  represented  by  a  few  tens  of 
parameters,  while  r  requires  perhaps  105  —  106  parameters  for  a  useful  degree  of  resolution. 
Thus  the  least-squares  problem  is  computationally  very  large,  and  efficient  minimization 
algorithms  are  required.  By  far  the  most  efficient  numerical  optimization  techniques  are 
the  descent  methods  related  to  Newton’s  method  —  when  they  work.  These  iterations  take 
steps  predicted  by  the  linearized  model/data  relation  so  rely  for  their  effectiveness  on  a 
close  relation  between  the  cost  function  and  its  quadratic  approximation.  Accordingly,  we 
now  examine  (somewhat  formally)  the  response  of  S  to  perturbations  in  Vb  and  r. 


From  the  oscillatory  integral  expression  above  the  perturbation  of  S  due  to  a  change 
Sv b  in  Vb  is 

SVbS  =  J  d£(i6(j>  ■  A  +  SA)ei*r  . 

This  is  an  oscillatory  integral  of  the  same  form  as  that  approximating  S,  with  a  different 
symbol.  In  fact  S(f>  is  also  homogeneous  of  order  1,  exactly  as  is  4>.  Therefore  the  symbol  in 
the  above  integral  grows  as  |£|n"l’2  as  |£|  — *  oo,  i.e.  at  a  more  rapid  rate  than  A.  It  follows 
that  for  at  least  some  oscillatory  r,  smooth  Svb 

\SVbS\»\S\. 

Taking  this  reasoning  one  step  further,  one  sees  immediately  that 

|^5|  »  \6VbS\  , 

that  is,  that  S  is  very  nonlinear  in  Vb. 

To  appreciate  the  consequences  for  the  behaviour  of  the  cost  function  Jis ,  we  introduce 
the  factorization 

SVbS  =  SQi 

which  follows  from  the  calculus  of  Fourier  integral  operators  (e.g.  Duistermaat  [4]).  Here 
is  a  pseudodifferential  operator,  i.e.  an  oscillatory  integral  of  the  form 

Qi<t>(x)  =  J  #e’*'€9i(*,0^0  • 

The  symbol  q\  =  qi[u6>^ufc]  is  of  order  1  (i.e.  grows  like  |£|  for  large  |f|)  and  depends 
smoothly  on  Vb,  linearly  on  Svb .  Moreover,  Q *  is  essentially  skew-adjoint:  there  is  another 
pseudodiffereential  operator  Qo  of  order  zero  (i.e.  whose  symbol  q0  is  bounded  as  |f  |  — >  oo) 
so  that 

Qi  +  Qi  =  Qo 

where  the  superscript  “T”  denotes  the  transpose,  or  formal  adjoint. 

Note  in  general  that  for  a  pseudodifferential  operator  B  with  symbol  b  depending 
smoothly  on  a  parameter  a: 

B[a\u(x)  =  J  d£b(x,t,a)etx<u(Z) 

the  perturbation  of  B  with  respect  to  a  is  a  pseudodifferential  operator  of  the  same  order, 
i.e.,  whose  symbol  has  the  same  order  of  growth  as  |^|  — 0  as  does  b: 

SaB[a]u(x)  =  J  dtfab^T^ay^u^)  . 

This  feature  of  pseudodifferential  operators  is  due  to  the  fixed  nature  of  the  phase:  it  is 
always  £-x,  independent  of  the  parameters  on  which  the  symbol  depends.  Note  the  contrast 
with  the  behaviour  of  oscillatory  integrals  such  as  5,  i.e.  Fourier  integral  operators  with 
phases  depending  nontrivially  on  parameters  ( Vb  in  the  case  of  S). 

With  the  standard  notations 

i'l’,4)  =  J 

IIV’II  = 


(the  integration  being  over  appropriate  variables),  we  can  write 

Jls  [w6>  ri  ^dataJ  =  2  ~  ^datall 

fivbJLS  —  —  ■S'jata) 

—  {‘S'[^i>]Qi['w6)  fivb]ri  —  ^data) 

=  (r,gfJT(5r-5datJ). 

(We  have  dropped  arguments  as  convenient  to  make  the  structure  of  the  expressions 
clearer).  Similarly,  the  perturbation  of  Jls  with  respect  to  r  is  given  by 

Sr  Jls  =  <*r,  •S3’[S’--Sdata]>- 

Differentiating  once  more,  we  find  the  following  expressions  for  the  blocks  of  the  Hessian 
operator: 

Slb,vbJLS  = 

(r,f>vbQiST[Sr  -  Sdata]  +  QT1{QT1STS  +  STSQ1)r) 

Slb,rJLS  = 

(Sr,QjSTSr  +  STSQ1r  -  Q^STSd&u) 

SlrJlS  = 

(< 6r,STS8r )  =  ||S£r||2  . 

Recall  that  Q\  is  of  order  1,  hence  enhances  high-frequency  content.  Again  according  to 
the  calculus  of  oscillatory  integrals,  QfQf  is  of  order  2,  hence  enhances  high-frequency 
components  even  more  strongly.  Thus  one  might  well  expect  that 

^ vbrvb  Jls  |  ^  ^  ^r,r  Jls 

and  this  is  indeed  the  case  for  oscillatory  r,6r  of  the  same  magnitude  and  smooth  Vb,6vb. 
Thus  the  Hessian  is  extremely  ill-conditioned. 

Moreover,  the  growth  rate  of  Jls  as  one  moves  Vb  away  from  the  minimizer  is  many 
times  the  overall  size  of  Jls  itself.  Therefore  the  growth  cannot  be  sustained  over  a  large 
change  in  Vb,  and  Jls  saturates.  Consequently  Jls  tends  to  be  very  non-convex,  with  a 
very  small  region  of  convexity  near  the  global  optimum  model.  See  Symes  and  Carazzone 
[17],  Figure  4  for  an  actual  picture  of  Jls  illustrating  these  features. 

The  highly  non-quadratic  nature  of  Jls  explains  the  great  difficulty  of  recovery  of  vj, 
by  least-squares  inversion  reported  frequently  in  the  literature  (Gauthier  et  al.  [5],  Kolb 
et  al.  [9],  Mora  [8],  for  example  —  see  also  Santosa  and  Symes  [12]).  One  can  say  with 
confidence  that  extraction  of  Vb  by  means  of  least-squares  inversion  and  local,  Newton-type 
optimization  is  impossible,  unless  the  initial  estimate  if  Vb  is  so  accurate  as  to  render  further 
refinement  almost  pointless. 

Random  or  systematic  search  has  been  suggested  as  an  alternative  technique  (e.g.  Cao 
et  al..  [2]).  Such  methods  may  work  well  when  the  background  velocity  may  be  represented 
by  a  few  parameters  in  a  known  way.  In  general,  severely  parsimonious  parameterization 
is  likely  to  introduce  unjustified  bias,  and  to  fail  to  sample  the  model  space  sufficiently  to 
well- approximate  the  optimal  Vb .  On  the  other  hand,  refined  parameterization  generates 
impossibly  large  search  tasks. 

In  sum,  estimation  of  Vb  via  the  least-squares  principle  is  unlikely  to  yield  useful  results 
in  general,  or  reliable  inversion  methods. 


4  The  Coherency  Method 

Our  resolution  of  the  difficulty  outlined  in  the  preceding  paragraphs  begins  with  two  obser¬ 
vations: 

(i)  For  fixed  Vf,,  Jls  is  perfectly  convex  —  in  fact,  quadratic! 

(ii)  If  the  set  of  shot  parameter  values  {zs}  reduces  to  a  singleton ,  e.g.  only  one  point 
source  record  is  used,  the  minimum  value  of  Jls  is  essentially  independent  of  Vb . 

That  is,  the  inversion  of  a  single  shot  record  is  feasible,  and  constrains  only  r,  not  Vb . 
Since  this  task  is  practical,  it  suggests  the  expedient  of  viewing  r  as  a  function  of  the  shot 
parameter  xs 

r  =  r(x,xs)  . 

Of  course,  if  5data  is  noise  free, 

«data  =  SKK 

then  r(x,xs )  =  r*(x)  is  amongst  the  minimizers  of 

J  J  dxrdt\S[vt]r(-,xa)  -  Sdata(-,  xs)|2 

and  has  the  addition  property  of  coherence,  or  independence  of  xs,  which  we  can  express  as 


Only  coherent  reflectivity  estimates  have  any  ultimate  meaning,  since  there  is  only  one 
earth! 

The  above  two  conditions  can  be  combined  into  a  single  cost  functional,  for  instance: 

\\S[vb]r-Sd&tJ2  +  ^\\dr/dxs\\2 

where  r  is  now  allowed  to  depend  explicitly  on  xs  —  with  such  dependence  penalized  by 
the  second  term,  weighted  by  a  parameter  cr2. 

This  functional  is  quadratic  in  r,  so  the  minimization  with  respect  to  r  presents  no 
difficulties,  in  principle.  On  the  other  hand,  as  a  functional  of  both  Vb  and  r,  it  is  still 
quite  non-convex,  for  the  same  reasons  as  before.  Together  these  two  observations  suggest 
elimination  of  r:  that  is,  we  define  a  functional  of  Vb  only  by 

Jcm  K;  ■S'dataJ  = 

mrin  \  jll5M  •  r  ~  •S'datall2  +  a2 

It  is  a  remarkable  fact  that  this  functional  is  smooth  —  in  fact,  nearly  quadratic  —  in 
its  dependence  on  Vb ,  despite  its  rather  close  relation  with  the  least  squares  functional!  We 
also  conjecture  that  it  is  strongly  convex  for  near-consistent  data  •S'data  an^  ProPer  choice 
of  a 2,  over  a  large  subset  of  background  velocity  models.  We  are  able  to  give  a  proof  in  the 
plane/wave  layered  medium  case  [13]. 


Minimization  of  Jcm  over  a  smooth  class  of  background  velocities  vi,  is  the  coherency 
optimization  problem.  Note  that  for  noise-free  data,  Jcm  attains  the  value  0  for  Vf,  =  vj*, 
which  is  clearly  its  global  minimum,  and  that  this  minimum  is  reached  by  setting  r  —  r* 
on  the  right-hand  side.  That  is,  the  global  minimum  is  achieved  at  the  correct  velocity  — 
and,  implicitly,  at  the  correct  reflectivity. 

In  the  remainder  of  this  section,  we  will  outline  the  reasons  for  the  smoothness  of  Jcm-, 
and  our  reasons  for  thinking  that  Jcm  might  be  minimized  quite  efficiently.  We  give  only 
the  formal  skeletons  of  arguments  here;  precise  statements  and  proofs  will  be  presented 
elsewhere. 

Before  starting  we  take  care  of  a  few  technical  details.  The  first  is  that  the  normal 
operator 

STS 


is  a  pseudodifferential  operator  of  order  n— 1  if  the  source  is  impulsive,  f(x,  t )  =  S(x—x3)6(t), 
under  some  ray-geometric  restrictions  (no  caustics  in  the  incident  wavefront).  This  is  an¬ 
other  immediate  consequence  of  the  FIO  calculus  (Duistermaat)  [4]),  and  is  mentioned 
explicitly  in  Beylkin  [1],  Rakesh  [11]  for  example.  As  shown  in  Percell’s  thesis  [10],  this 
conclusion  is  false  when  caustics  are  present  in  the  incident  wave-front  —  a  generic  occur¬ 
rence  in  heterogeneous  media.  It  is  possible  to  recover  the  pseudodifferential  nature  of  STS 
by  modifying  the  definition  of  S.  Without  going  into  details,  we  assume  that  this  has  been 
done.  This  operator  is  elliptic ,  i.e.  acts  as  an  invertible  Fourier  multiplier  at  high  spatial 
frequencies,  over  a  conic  sector  of  wave  vectors  (the  “reflection  aperture”)  determined  by  the 
relative  positions  of  sources  and  receivers  and  the  ray  geometry  of  the  background  velocity 
field.  Outside  of  the  reflection  aperture,  which  varies  with  location  in  the  subsurface,  ST S 
suppresses  high-frequency  components  (these  correspond  to  “off-cable”  reflections).  There¬ 
fore  the  high-frequency  components  of  r  outside  the  inversion  aperture  must  be  constrained 
a  priori  in  solving  equations  involving  STS. 

To  accomplish  this  goal  in  a  well-scaled  way,  we  first  modify  the  definition  of  S:  we 
assume  that  the  source  has  point-support,  and  in  its  time  dependence  is  a  low  frequency 
perturbation  of  the  (^rO-th  derivative  of  6(t ):  thus 


f(x,t)  =  const. 


_ 3 

6(x  -  xs)t+ 2  ,  n  =  2 
S(x  -  xs)S(t )  ,  n  =  3 

+  fo{x,t) 


(the  distribution  t+A  is  defined  in  Gel’fand  and  Shilov  [6],  for  example),  where  /o  is  a 
smooth  function.  This  amounts  to  assuming  that  /,  while  bandlimited  below,  behaves 

_  3 

as  t+2  (n  —  2)  or  6(t)  ( n  =  3)  across  the  upper  part  of  the  passband  of  the  seismic 
signals.  Practically,  this  assumption  is  realized  by  preprocessing  the  data  to  re-scale  it  in 
the  frequency  domain. 

With  this  modification,  STS  is  a  pseudodifferential  operator  of  order  2:  specifically, 
STS  r(xs,x)  =  J  d£b(xs,x,t)etx<r(xs,£) 

where  b  ~  bo(xs,  x,  £/|£|)|£|2  within  the  reflection  aperture,  ~  0  outside  of  it,  as  |£|  — >  oo. 
Thus  STS  is  a  family  of  pseudodifferential  operators  in  x,  parameterized  by  xs,  of  order 
2.  It  is  a  slight  technical  headache  that  such  a  family  of  operators  is  not  a  pseudodifferen¬ 
tial  operator  in  x  and  xs;  however  this  is  not  an  essential  complication  (e.g.,  Taylor  [19], 
Appendix)  and  we  shall  ignore  it  here. 


We  chose  a  regularizing  operator  R,  pseudodifferential  of  order  2  in  r  and  depending 
parametrically  on  xs,  so  that 

STS  +  X2R 

is  elliptic  for  each  xs  as  long  as  A2  >  0.  A  simple  choice  is 


R  =  I  -  V* . 


This  choice  is  suboptimal,  as  it  also  affects  the  components  within  the  reflection  aperture, 
but  for  small  A2  this  is  probably  of  little  consequence.  It  will  be  important  in  the  sequel  to 
write  R  =  CTC,  with  C  a  pseudodifferential  operator  of  order  1.  This  is  certainly  possible 
for  the  simple  choice  just  given  with  C  =  (/  -  V2)? . 

Having  disposed  of  these  preliminaries  we  recall  that  the  coherency  method  functional 
Jcm  is  defined  by  minimizing  over  r  the  (regularized)  quadratic 


1 

2 


Sr-  Sdatal|2  +  >?(',&) +  *1 


dr 


dx. 


Our  goal  (and  an  important  step  in  the  proof  that  Jcm  is  smooth)  is  to  show  that  the 
derivative  6  Jcm  of  Jcm  with  respect  to  Vb  is  of  the  same  size  (roughly)  as  Jcm  itself.  Note 
the  contrast  with  the  behaviour  of  the  least-squares  functional  Jls  described  in  the  last 
section. 

A  minimizer  of  the  above  quadratic  is  a  solution  of  the  normal  equations 


N r  ~  [STS  +  Vfl  -  <r2  dl/dzDr  =  STSdata  . 


The  operator  N  is  (essentially)  an  elliptic  pseudodifferential  operator  of  order  2  in  x,xs. 
Standard  techniques  show  that  N  is  invertible,  under  reasonable  restrictions  on  r,  and  that 
r  depends  stably  on  Sdata  in  suitable  norms. 

Since  S  =  S[vb],  the  solution  of  the  normal  equations  also  depends  on  r  =  r[v;,, 
also.  The  dependence  of  r  on  Vb  is  quite  erratic  —  this  is  another  consequence  of  our  analysis 
of  the  least-squares  problem  in  the  last  section. 

With  these  conventions  we  can  calculate  the  derivative  of  Jcm  with  respect  to  vy. 


SJcm 


(SS  -r  +  S  -Sr,  Sr-  Sdata) 

+  **<*•,*•>  +  *’<  A- «r,  A.  r> 

<«£.>•,  Sr -Sdata) 


+  (Sr, 


|st5  +  A 2R  - 


r  ~  S^data) 


=  (SS  -r,Sr-  Sdata) 

+  (Sr,Nr-STSd ata)  • 


Here  Sr  is  the  (implicit)  derivative  of  r  =  rjv;,,  S^ata]  with  respect  to  Vb .  This  could  be 
computed  by  differentiating  the  normal  equations,  but  fortunately  this  effort  is  unnecessary: 
because  of  the  normal  equations  the  second  term  drops  out.  Thus 


SJcm  =  (6 Sr,  Sr  -  Sdata)  . 


Recalling  the  factorizations 

SS  =  SQl  and  R  =  CTC 

we  calculate 

(6Sr,Sr- Sd&ta)  = 

(g1r,5T(5r-5data)} 

=  (Q1r,(X2R- o2  d2/dx2s)r) 

(normal  equations  again!) 

=  (Qlr,X2CTC  +  o2(d/dxs)T(d/dxs)r) 

=  X2(CQir,  Cr )  +  a2(d/dxs  Qir,  d/dxsr) 

=  X 2{QxCr,Cr)  +  a2(Qi  dr/dxs,  dr/dxs) 

+  A2(  [C,  Qi]r,  Cr)  +  <r2(  [ d/dxs ,  Qx]r,  dr/dxs)  . 

Here  we  have  used  the  (standard)  notation  for  the  commutator  of  two  operators: 

[A,  B]  =  AB  -  BA  . 

The  calculus  of  pseudodifferential  operators  shows  that  the  commutator  of  a  pair  of  opera¬ 
tors  of  orders  p  and  q  is  an  operator  of  order  p  +  q  —  1.  That  is,  both 

[C,Qi]  and  [d/dx3,  Qi] 

are  of  order  1.  On  the  other  hand,  recall  that  Q\  is  essentially  skew-adjoint:  using  the 
notations  of  the  last  section 

{QiCr,  Cr)  =  \((Qi+Qj)Cr,Cr) 

=  \(QoCr,Cr) 

where  Q o  is  a  pseudodifferential  operator  order  zero.  Similarly, 

(Qi  dr/dxs,dr/dxs)  =  ^  (Q0  dr/dxs,dr/dxs)  . 

The  upshot  is  the  formula 

SJcm  =  (r,  [X\Ct[CMa\ctQ0C) 

-  cr2(d/dxs  [ d/dxs ,  Qi]  -(-  ^ d/dxs  Q0  d/dxs)]r) 

and  the  operator  in  brackets  on  the  right-hand  side  is  of  order  2.  On  the  other  hand,  Jcm 
itself  can  be  written  as 

Jcm  =  hr,  [S'S  +  \‘R  -  a2  il2!dz,}r  -  257'5'datai  +  jl!.S'T.?,|il,ai|2 
=  l(r.  Nr)  -  STSdaU)  +  ^IIS^JI2 

the  quadratic  part  of  which  also  involves  an  operator  of  order  2.  Therefore  SJcm  should  be 
of  roughly  the  same  size  as  Jcm-,  as  claimed. 


Note  the  difference  with  the  least-square  functional:  because  the  r  appearing  in  the 
definition  of  Jcm  solves  the  normal  equations,  we  were  able  to  reduce  the  order  in  frequency 
of  the  symbols  appearing  in  8  Jcm ,  via  symmetry  considerations. 

Further  analysis  along  these  lines  shows  that  the  formal  Hessian  82Jcm  is  also  given  by 
a  quadratic  form  in  r  defined  by  a  symmetric  pseudodifferential  operator  of  order  2.  This 
is  not,  of  course,  a  proof  that  82Jcm  is  a  well-conditioned  quadratic  form  in  8v &,  but  it  is 
certainly  a  step  in  that  direction. 

More  discussion,  including  an  outline  of  the  proof  that  Jcm  is  smooth  and  a  practical 
calculation  of  its  gradient,  may  be  found  in  Symes  [14],  which  also  contains  some  numerical 
investigations  of  Jcm ■  Examples  of  the  gradient  calculation,  and  an  initial  attempt  at 
velocity  inversion  by  optimization  of  Jcm ,  appear  in  Symes  [15]. 
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